22 research outputs found
Identifying the structure patterns to govern the performance of localization in regulating innovation diffusion
Macro-level social influence is recognized as a non-negligible ingredient in
innovation propagation: more adopters in the network lead to a higher adoption
tendency among the remaining individuals. A recent study incorporating this
mechanism shows that sufficiently intense macro-level social influence can
change the transition from continuous to discontinuous, which
indicates the existence of a tricritical point. Although network localization
strength determines the tricritical point, it remains unclear what network
quantities govern the performance of localization in regulating innovation
diffusion. To address this issue, we consider a model incorporating
both micro- and macro-level social influence. We present a dynamic
message-passing method to analytically treat both the outbreak threshold and
recovered population, and validate the predictions through agent-based
simulations. Extensive analysis on classical synthetic networks shows that
sparse connections, a relatively heterogeneous degree
distribution, and either assortative or extremely disassortative configurations
favor a continuous transition. In such cases, the network
yields a strong localization effect, so that the innovation is trapped in
configurations composed of hubs with high non-backtracking centrality. We
further explore the dependence of both the tricritical point and the localization
strength on three structural quantities: network density, heterogeneity, and
assortativity. This gives a clear physical picture of the joint effects of the
three structural quantities on the localization strength. Finally, we conclude
that the core-periphery structure, being sensitive to changes in the three
structural quantities, essentially determines localization strength, and further
regulates the phase transition.
Comment: 23 pages, 10 figures, 1 table
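As an illustration of the non-backtracking centrality mentioned in the abstract above, here is a minimal sketch in Python. It is not the authors' dynamic message-passing method. It only computes the leading eigenvector of the non-backtracking matrix by power iteration and sums it per node. The function names, the spectral shift used to stabilize the iteration, and the use of the inverse participation ratio as a rough localization indicator are all assumptions made for this example.

```python
from math import sqrt


def non_backtracking_centrality(adj, iters=300):
    """Node scores from the leading eigenvector of the non-backtracking matrix.

    adj: dict mapping each node to the set of its neighbours
         (undirected graph, no self-loops, at least one cycle).
    """
    # One vector entry per directed edge (i -> j).
    v = {(i, j): 1.0 for i in adj for j in adj[i]}
    for _ in range(iters):
        new = {}
        for (i, j), old in v.items():
            # Contributions arriving at i from every neighbour k except j,
            # so a walk never immediately returns along the same edge.
            incoming = sum(v[(k, i)] for k in adj[i] if k != j)
            # Adding the old value shifts all eigenvalues by 1 (assumption:
            # used here only to avoid oscillation on periodic graphs).
            new[(i, j)] = incoming + old
        norm = sqrt(sum(x * x for x in new.values()))
        v = {e: x / norm for e, x in new.items()}
    # Score of node j: sum of the edge entries pointing into j.
    scores = {j: sum(v[(i, j)] for i in adj[j]) for j in adj}
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}


def inverse_participation_ratio(scores):
    """Close to 1/N when scores are spread evenly, close to 1 when localized."""
    squares = [s * s for s in scores.values()]
    return sum(q * q for q in squares) / sum(squares) ** 2


# Hypothetical example graph: two triangles sharing node 0.
bowtie = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {0, 3}}
centrality = non_backtracking_centrality(bowtie)
print(centrality)
print(inverse_participation_ratio(centrality))
```

In this toy graph the shared node 0 receives the largest score, which is the kind of hub concentration the abstract refers to as localization.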
Hierarchically-Refined Label Attention Network for Sequence Labeling
CRF has been used as a powerful model for statistical sequence labeling. For
neural sequence labeling, however, BiLSTM-CRF does not always lead to better
results compared with BiLSTM-softmax local classification. This can be because
the simple Markov label transition model of CRF does not give much information
gain over strong neural encoding. For better representing label sequences, we
investigate a hierarchically-refined label attention network, which explicitly
leverages label embeddings and captures potential long-term label dependency by
giving each word incrementally refined label distributions with hierarchical
attention. Results on POS tagging, NER and CCG supertagging show that the
proposed model not only improves the overall tagging accuracy with a similar
number of parameters, but also significantly speeds up training and testing
compared to BiLSTM-CRF.
Comment: EMNLP 201
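The idea of attending over label embeddings to obtain a per-word label distribution, then refining it layer by layer, can be sketched as follows. This is a simplified illustration and not the architecture from the paper: it uses plain dot-product attention on toy vectors, and the residual-style update between layers, the function names, and the example embeddings are assumptions.

```python
from math import exp


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(word_vec, label_embs):
    """Attend from one word vector over all label embeddings.

    Returns the label distribution and the attention-weighted mix of
    label embeddings.
    """
    scores = [sum(w * e for w, e in zip(word_vec, emb)) for emb in label_embs]
    probs = softmax(scores)
    mixed = [
        sum(p * emb[d] for p, emb in zip(probs, label_embs))
        for d in range(len(word_vec))
    ]
    return probs, mixed


def refine(word_vec, label_embs, layers=2):
    """Apply label attention repeatedly; return the last layer's distribution."""
    vec = list(word_vec)
    probs = None
    for _ in range(layers):
        probs, mixed = label_attention(vec, label_embs)
        # Assumption: fold the label information back in with a simple sum.
        vec = [v + m for v, m in zip(vec, mixed)]
    return probs


# Hypothetical 2-dimensional embeddings for three labels.
labels = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
print(refine([2.0, 0.0], labels, layers=1))
print(refine([2.0, 0.0], labels, layers=2))
```

With these toy values the second layer assigns more probability to the first label than the first layer does, which shows the intended effect of incremental refinement.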
Collaborative Evaluation: Exploring the Synergy of Large Language Models and Humans for Open-ended Generation Evaluation
Humans are widely involved in the evaluation of open-ended natural language
generation (NLG) tasks that demand creativity, as automatic metrics often
exhibit weak correlations with human judgments. Large language models (LLMs)
have recently emerged as a scalable and cost-effective alternative to human
evaluations. However, both humans and LLMs have limitations, i.e., inherent
subjectivity and unreliable judgments, particularly for open-ended tasks that
require adaptable metrics tailored to diverse task requirements. To explore the
synergy between humans and LLM-based evaluators and address the challenge of
inconsistent evaluation criteria in open-ended NLG tasks, we propose a
Collaborative Evaluation pipeline, CoEval, involving the design of a checklist
of task-specific criteria and the detailed evaluation of texts, in which an LLM
generates the initial ideation and humans then scrutinize it. We conducted a
series of experiments to investigate the mutual effects between LLMs and humans
in CoEval. Results show that, by utilizing LLMs, CoEval effectively evaluates
lengthy texts, saving significant time and reducing human evaluation outliers.
Human scrutiny still plays a role, revising around 20% of LLM evaluation scores
for ultimate reliability.
Comment: We release our resources at https://github.com/qtli/CoEval
Automated Action Model Acquisition from Narrative Texts
Action models, which take the form of precondition/effect axioms, facilitate
causal and motivational connections between actions for AI agents. Action model
acquisition has been identified as a bottleneck in the application of planning
technology, especially within narrative planning. Acquiring action models from
narrative texts in an automated way is essential, but challenging because of
the inherent complexities of such texts. We present NaRuto, a system that
extracts structured events from narrative text and subsequently generates
planning-language-style action models based on predictions of commonsense event
relations, as well as textual contradictions and similarities, in an
unsupervised manner. Experimental results in classical narrative planning
domains show that NaRuto can generate action models of significantly better
quality than existing fully automated methods, and even on par with those of
semi-automated methods.
Comment: 10 pages, 3 figure
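To make the notion of a precondition/effect action model concrete, here is a minimal STRIPS-style sketch in Python. It is not NaRuto's output format. The class name, the split of effects into add and delete sets, and the door example are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A STRIPS-style action: facts required before, facts changed after."""

    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset = frozenset()

    def applicable(self, state):
        # The action can fire only if every precondition holds in the state.
        return self.preconditions <= state

    def apply(self, state):
        if not self.applicable(state):
            raise ValueError(f"{self.name} is not applicable in this state")
        # Remove deleted facts first, then add the new ones.
        return (state - self.del_effects) | self.add_effects


# Hypothetical narrative action: a character opens a door.
open_door = Action(
    name="open_door",
    preconditions=frozenset({"at_door", "door_closed"}),
    add_effects=frozenset({"door_open"}),
    del_effects=frozenset({"door_closed"}),
)

state = frozenset({"at_door", "door_closed"})
print(open_door.apply(state))
```

Chaining such actions, where one action's effects satisfy another's preconditions, is what gives a planner the causal connections the abstract describes.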
Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance
Proprietary Large Language Models (LLMs), such as ChatGPT, have garnered
significant attention due to their exceptional capabilities in handling a
diverse range of tasks. Recent studies demonstrate that open-sourced smaller
foundational models, such as 7B-size LLaMA, can also display remarkable
proficiency in tackling diverse tasks when fine-tuned using instruction-driven
data. In this work, we investigate a practical problem setting where the
primary focus is on one or a few particular tasks rather than general-purpose
instruction following, and explore whether LLMs can be beneficial and further
improved for such targeted scenarios. We choose the writing-assistant scenario
as the testbed, which includes seven writing tasks. We collect training data
for these tasks, reframe them in an instruction-following format, and
subsequently refine the LLM, specifically LLaMA, via instruction tuning.
Experimental results show that fine-tuning LLaMA on writing instruction data
significantly improves its ability on writing tasks. We also conduct more
experiments and analyses to offer insights for future work on effectively
fine-tuning LLaMA for specific scenarios. Finally, we initiate a discussion
regarding the necessity of employing LLMs for only one targeted task, taking
into account the efforts required for tuning and the resources consumed during
deployment.
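Reframing a task example in an instruction-following format, as described in the abstract above, might look like the following sketch. The prompt template, the field names, and the grammar-correction example are hypothetical. The abstract does not specify the template or list the seven tasks.

```python
def to_instruction_example(instruction, source_text, target_text):
    """Wrap one (input, output) pair of a writing task as a prompt/completion pair.

    The template below is an illustrative assumption, not the one used in the paper.
    """
    prompt = (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Input:\n"
        f"{source_text}\n\n"
        "### Response:\n"
    )
    return {"prompt": prompt, "completion": target_text}


# Hypothetical writing task: grammatical error correction.
example = to_instruction_example(
    instruction="Fix grammatical errors in the text.",
    source_text="She go to school every day.",
    target_text="She goes to school every day.",
)
print(example["prompt"] + example["completion"])
```

During instruction tuning, the model would typically be trained to produce the completion given the prompt, so every task shares one text-to-text interface.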
LogiCoT: Logical Chain-of-Thought Instruction-Tuning
Generative Pre-trained Transformer 4 (GPT-4) demonstrates impressive
chain-of-thought reasoning ability. Recent work on self-instruction tuning,
such as Alpaca, has focused on enhancing the general proficiency of models.
These instructions enable the model to achieve performance comparable to
GPT-3.5 on general tasks like open-domain text generation and paraphrasing.
However, they fall short of helping the model handle complex reasoning tasks.
To bridge the gap, this paper presents LogiCoT, a new instruction-tuning
dataset for Logical Chain-of-Thought reasoning with GPT-4. We elaborate on the
process of harvesting instructions for prompting GPT-4 to generate
chain-of-thought rationales. LogiCoT serves as an instruction set for teaching
models logical reasoning and elicits general reasoning skills.